Automotive engine misfire detection system including a bit-serial based recurrent neuroprocessor

ABSTRACT

An engine diagnostic system includes a bit-serial based recurrent neuroprocessor that processes data from an internal combustion engine in order to diagnose misfires in real-time. The neuroprocessor reduces the number of neurons required to perform the task by time multiplexing groups of neurons from a candidate pool of neurons to achieve the successive hidden layers of the recurrent network topology.

This application claims the benefit of Provisional application Ser. No. 60/029,593, filed Oct. 23, 1996.

TECHNICAL FIELD

This invention relates generally to detection of misfires in automotive internal combustion engines and, more particularly, to a bit-serial based recurrent neuroprocessor for processing data from an internal combustion engine in order to diagnose misfires in automotive engines in real-time.

BACKGROUND ART

Misfire diagnostics in internal combustion engines requires the detection and identification of improper combustion events in each firing cylinder. In order to utilize information from sensors now in production use, the diagnostic is based upon analysis of crankshaft dynamics. Misfire detection relies on computing the derivatives of crankshaft position sensor signals to determine short term average crankshaft accelerations several times each revolution. These accelerations are computed over rotational segments spanning the intervals between engine cylinder firings.

While analysis of crankshaft dynamics permits detection of engine misfire in many circumstances, the signal signatures of these events are obscured by complicated dynamics arising from torsional oscillations occurring in the crankshaft itself as a result of the excitation of its intrinsic normal modes. Since detection of misfires is based upon the principle that engine misfires cause a torque impulse to be absent, with a consequent torque or acceleration deficit, the detection of the deficit forms the basis of the misfire diagnostic. The diagnostic algorithms must detect the acceleration deficit in such a manner as to reliably detect the engine misfires, while recognizing the more frequently occurring normal combustion events. Analysis of the dynamics must result in failure detecting capabilities in excess of 95% of all failure events, with simultaneous identification of normal events as normal, with accuracies approaching 99.9%.

SUMMARY OF THE INVENTION

In accordance with the present invention, an engine diagnostic system is provided that includes a neuroprocessor capable of efficiently performing the required computations for detecting engine misfire and/or performing other diagnostic functions. The architecture and hardware are sufficiently flexible to be able to perform the misfire diagnostic task and still have the capability of performing other diagnostics or control functions in automotive systems such as idle speed control and air/fuel ratio control. This flexibility is achieved through the use of a high speed hardware realization of basic neural network blocks or units and time-multiplexing these blocks to form the specific neural architecture required for any designated task.

More specifically, the neuroprocessor achieves its compactness and cost effectiveness by employing a combination of bit-serial and bit-parallel techniques in the implementation of the neurons of the neuroprocessor and reduces the number of neurons required to perform the task by time multiplexing groups of neurons from a fixed pool of neurons to achieve the successive hidden layers of the recurrent network topology. For most recurrent neural network vehicular applications such as misfire detection, a candidate pool of sixteen silicon neurons is deemed to be sufficient. By time multiplexing, the sixteen neurons can be re-utilized on successive layers. This time-multiplexing of layers radically streamlines the VLSI architecture by significantly increasing hardware utilization through reuse of available resources.

BRIEF DESCRIPTION OF THE DRAWINGS

A more complete understanding of the present invention may be had from the following detailed description which should be read in conjunction with the drawings in which:

FIG. 1 is a plot showing instantaneous acceleration versus crankshaft rotation for normal and misfiring cylinders;

FIG. 2 is a schematic and block diagram showing measuring and calculating apparatus according to the present invention;

FIG. 3 is a block diagram showing the various inputs to a recurrent neural network for diagnosing engine misfires in accordance with the present invention;

FIG. 4 shows the topology of a recurrent neural network of the present invention;

FIG. 5 is a block diagram of the bit-serial architecture of a neuron used in the recurrent neural network;

FIG. 6 depicts the time-multiplexed layer technique used in the present invention;

FIG. 7 is a block diagram of the neuroprocessor;

FIG. 8 is a block diagram of the global controller of the neuroprocessor;

FIG. 9 is a block diagram of the run time controller of the global controller;

FIG. 10 is a schematic diagram of the multiplier of FIG. 5; and

FIG. 11 is a schematic diagram of the accumulator of FIG. 5.

DETAILED DESCRIPTION OF PREFERRED EMBODIMENT

A typical 4-stroke combustion engine cycle includes the intake stroke, the compression stroke, the power stroke, and the exhaust stroke. As shown in FIG. 1, the power strokes of the respective cylinders are arranged in a particular order according to crankshaft position. Furthermore, in any engine having more than four cylinders, the power strokes of different cylinders will overlap. One engine cycle comprises 720° of crankshaft rotation during which each cylinder passes through each of its four strokes.

Curve 10 in FIG. 1 shows approximate acceleration fluctuation during engine operation. An acceleration peak 12 occurs during the firing interval of cylinder No. 1 and other maximums in the acceleration curve occur approximately corresponding to each other properly firing cylinder. When a misfire occurs such that no significant power is created by a cylinder during its firing interval, the crankshaft decelerates as indicated at 14.

Crankshaft based misfire detectors have advantageously employed measured rotation intervals occurring at a frequency of about once per cylinder firing rather than attempting to measure instantaneous values as shown in FIG. 1. FIG. 2 shows an apparatus for measuring velocity and obtaining corrected acceleration values as described more fully in U.S. Pat. No. 5,531,108, which is incorporated herein by reference. An engine rotation position sensing system includes a rotor 20 having vanes 22, 24, and 26 which rotate with a crankshaft 28 (a 3-vane rotor from a 6-cylinder engine is shown in this example). Vanes 22-26 pass between a Hall-effect sensor 30 and a permanent magnet 32 to generate a profile ignition pulse (PIP) signal as crankshaft 28 rotates. Vanes 22-26 are arranged to generate a rising edge in the PIP signal at a predetermined position in relation to top dead center of each respective cylinder. The PIP signal actually indicates the approach to top dead center of two engine cylinders, one of which is approaching a power stroke and one of which is approaching an intake stroke since it takes two full crankshaft rotations to complete an engine cycle.

A cylinder identification (CID) sensor 34 is connected to a camshaft 36 for identifying which of the two cylinders is actually on its power stroke. Camshaft 36 rotates once for every two rotations of crankshaft 28. The resulting CID signal is preferably generated having a rising edge corresponding to the power stroke of cylinder No. 1. A timer 38 receives the PIP signal and the CID signal and measures elapsed time between predetermined engine position locations as determined by the PIP and CID signals. The elapsed time ΔT_i for each velocity measuring interval “i” is output from timer 38 to a velocity and acceleration calculator 40. Preferably, timer 38 and velocity and acceleration calculator 40 are implemented as part of a micro-controller with an associated memory 42 for storing correction factors, other data, and software instructions. An alternative position sensing apparatus may include a multi-toothed wheel mounted on the engine for rotation with the crankshaft, as disclosed in the aforementioned patent.
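The conversion of elapsed times into velocity and acceleration values can be illustrated with a minimal sketch. It assumes plain finite differences between consecutive measuring intervals; the correction factors described in U.S. Pat. No. 5,531,108 are not modeled, and the function names and sample timings are hypothetical.

```python
import math

def interval_velocities(delta_t, interval_deg):
    """Average angular velocity (rad/s) over each measuring interval."""
    angle = math.radians(interval_deg)
    return [angle / dt for dt in delta_t]

def interval_accelerations(delta_t, interval_deg):
    """Finite-difference acceleration (rad/s^2) between consecutive intervals."""
    v = interval_velocities(delta_t, interval_deg)
    acc = []
    for i in range(1, len(v)):
        # Time between the midpoints of interval i-1 and interval i.
        dt_mid = 0.5 * (delta_t[i - 1] + delta_t[i])
        acc.append((v[i] - v[i - 1]) / dt_mid)
    return acc

# Hypothetical elapsed times (s) for 120-degree intervals (6-cylinder engine).
# The third interval is longer, as it would be after a missing torque impulse.
times = [0.0100, 0.0100, 0.0104, 0.0100]
accels = interval_accelerations(times, 120.0)
```

With these sample timings the second acceleration value is negative (the deficit) and the third is positive (recovery), which is the signature the diagnostic looks for.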

Referring to FIG. 3, the misfire detection system of the present invention includes a recurrent neural network or neuroprocessor, generally designated 50, that has been trained off-line as a classification system to recognize engine misfires. The network 50 is a custom VLSI CMOS application specific integrated circuit (ASIC) that will be described more fully hereinafter. The four inputs to the network are engine speed (RPM), engine load (LOAD), crankshaft acceleration (ACCEL) and cylinder identification (CID). As previously indicated, CID enables the network to determine which cylinder is being observed. The output of the network 50 is a value that is positive or negative indicating the presence or absence of a misfire.

As shown in FIG. 4, the topology of the recurrent neural network 50 for the misfire application is a multilayer perceptron, with 4 inputs, a first hidden layer 52, a second hidden layer 54, and a single output node. All node activation functions are bipolar sigmoids. Each hidden layer contains a plurality of nodes and is completely recurrent, thereby providing internal feedback through unit time delays between each node as indicated by the Z⁻¹ blocks. In effect, the output of each node serves as input to each node of the same layer at the next time step.

The target output for the network is +1 or −1 according to whether a misfire was present or absent respectively, a specified number of time steps earlier, for example 8. This temporal offset, determined from preliminary training trials, permits the network to make use of information following the specific event it is to classify, because short lived disturbances cause effects which persist for an engine cycle.
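The recurrence through the unit delays can be sketched in a few lines. This is an illustration only, using floating point and a toy 2-input, 2-node layer with made-up weights; the names `recurrent_layer_step`, `w_in` and `w_rec` are assumptions, not terms from the specification.

```python
import math

def bipolar_sigmoid(x):
    """Bipolar sigmoid with outputs in the open interval (-1, 1)."""
    return -1.0 + 2.0 / (1.0 + math.exp(-x))

def recurrent_layer_step(inputs, prev_outputs, w_in, w_rec, bias):
    """One time step of a fully recurrent layer.

    w_in[j][i]  : weight from external input i to node j
    w_rec[j][k] : weight from node k's output at the previous step to node j
    """
    outputs = []
    for j in range(len(bias)):
        x = bias[j]
        x += sum(w * u for w, u in zip(w_in[j], inputs))
        x += sum(w * y for w, y in zip(w_rec[j], prev_outputs))
        outputs.append(bipolar_sigmoid(x))
    return outputs

# Toy layer with hypothetical weights.
w_in = [[0.5, -0.5], [1.0, 0.0]]
w_rec = [[0.0, 1.0], [-1.0, 0.0]]
bias = [0.0, 0.0]
state0 = [0.0, 0.0]  # recurrent activations start at zero
state1 = recurrent_layer_step([1.0, 1.0], state0, w_in, w_rec, bias)
state2 = recurrent_layer_step([1.0, 1.0], state1, w_in, w_rec, bias)
```

Although the external input is identical at both steps, `state2` differs from `state1` because node 0 now receives the delayed output of node 1.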
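The temporal offset between an event and its training target can be made concrete with a small sketch. The helper name `delayed_targets` and the use of `None` for steps that have no target yet are illustrative assumptions.

```python
def delayed_targets(misfire_flags, delay=8):
    """Target at step k describes the combustion event at step k - delay.

    Returns +1 for a misfire, -1 for a normal event, and None for the
    first `delay` steps, where no earlier event exists to classify.
    """
    targets = []
    for k in range(len(misfire_flags)):
        if k < delay:
            targets.append(None)
        else:
            targets.append(1 if misfire_flags[k - delay] else -1)
    return targets

# Sixteen cylinder events with a single misfire at step 3.
flags = [False] * 3 + [True] + [False] * 12
targets = delayed_targets(flags)
```

Here the misfire at step 3 is reported at step 11, after the network has seen the eight following events.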

The preferred network architecture for misfire detection is 4-15R-7R-1, i.e., 4 inputs, 2 fully recurrent hidden layers with 15 and 7 nodes, and a single output node. The network executes once per cylinder event (e.g., 8 times per engine cycle for an 8-cylinder engine). The inputs at time step “k” are the crankshaft acceleration (ACCEL), averaged over the last 90 degrees of crankshaft rotation, engine load (LOAD), computed from the mass flow of air, engine speed (RPM), and a cylinder identification signal (CID), e.g., 1 for cylinder 1, 0 otherwise, which allows the network to synchronize with the engine cylinder firing order. This network contains 469 weights; thus one execution of the network requires 469 multiply-accumulate (MAC) operations and 23 evaluations of the activation function, a computational load of 187,000 MAC s⁻¹.
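The weight and activation counts follow directly from the 4-15R-7R-1 topology, as the short calculation below shows. Each hidden node is taken to receive the feed-forward inputs, the recurrent outputs of its own layer, and one bias. The specification does not state the engine speed behind the MAC rate; 6000 rpm for an 8-cylinder engine is an assumption chosen here because it gives a figure close to the one quoted.

```python
def count_weights(n_inputs, hidden_sizes, n_outputs):
    """Number of weights in a network with fully recurrent hidden layers."""
    total = 0
    fan_in = n_inputs
    for size in hidden_sizes:
        # feed-forward inputs + recurrent inputs from the same layer + bias
        total += size * (fan_in + size + 1)
        fan_in = size
    # Output nodes are not recurrent: previous layer + bias only.
    total += n_outputs * (fan_in + 1)
    return total

weights = count_weights(4, [15, 7], 1)      # 15*20 + 7*23 + 1*8
activations = 15 + 7 + 1                    # one sigmoid evaluation per node

# Assumed operating point (not given in the text): 8 cylinders at 6000 rpm.
# A 4-stroke engine fires half of its cylinders per crankshaft revolution.
events_per_second = (6000 / 60) * (8 / 2)
mac_rate = weights * events_per_second
```

Under that assumption the network runs 400 times per second, giving 187,600 MAC per second.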

The neural network is trained off-line with a standard digital computer and the trained weights are then stored in vehicle computer memory accessible to the neural chip hardware. In order to obtain optimum performance, the precision of the arithmetic in the off-line processors is matched to the bit precision of the hardware. The network is trained using derivatives computed by an extension of the well-known real time recurrent learning method (also known as “dynamic backpropagation”) with weight updates based on the extended Kalman filter (EKF) procedure. The multi-stream training method is also employed. Multi-stream training procedures coordinate updates based jointly on data streams from different engine operating conditions. In the present case, 40 streams were used, each stream formed by randomly choosing a starting position in a file chosen randomly from the 25 files used. Periodically, new stream assignments are made, for example, after 250 instances from each stream have been processed. After such reassignment, recurrent node activations are initialized to zero and some number of instances are processed by the network without weight updates, in order to “prime” the network, i.e., to allow reasonable node activations to be formed. Each weight update step attempts to minimize jointly the current errors made by the network on each of the 40 streams. Because the EKF method is used, updates are not merely a simple average of the updates that would have been computed by the individual streams.

The database used for network training was acquired by operating a production vehicle over a wide range of operation, including engine speed-load combinations that would rarely be encountered in normal driving. Misfire events are deliberately introduced (typically by interrupting the spark) at both regular and irregular intervals. Misfire alters the torsional oscillation pattern for several subsequent time steps, so it is desirable to provide the network with a range of misfire intervals for all combinations of speed and load that correspond to positive engine torque. Though the data set used for training included more than 600,000 examples (one per cylinder event), it necessarily only approximates full coverage of the space of operating conditions and possible misfire patterns. Further discussion of training details may be found in co-pending application Ser. No. 08/744,258, filed Nov. 6, 1996, which is incorporated herein by reference.

Real-time signal processing requirements, as in the case of the engine misfire detection problem, typically require implementation in dedicated or algorithm specific VLSI hardware. Sample rate requirements for the broad class of automotive engine diagnostics can vary from low to moderate. This combination of low to moderate sample rates coupled with the a priori specification for cost effective hardware suggested a neuroprocessor based on a compact bit-serial computational architecture. Bit-serial implementations process one input bit at a time, and are therefore ideal for low to moderate speed applications and are compact and therefore cost-effective. In contrast, bit-parallel systems process all input bits of a sample in one clock cycle, and therefore require more silicon area, interconnections, and pin-outs.

The bit-serial architecture of a neuron is shown in FIG. 5. Since the functionality of a neuron is to accumulate products formed from the multiplication of synaptic weights by corresponding stimulating activations, the VLSI silicon neuron closely models this functionality. In the silicon embodiment, a digital neuron consists of three functional building blocks. The blocks are pipelined and consist of a bit-serial multiplier 56, a bit-serial accumulator 58, and a bit-parallel output data latch 60. The input activations are provided to the multiplier in a bit-parallel fashion whereas the input synaptic weights are provided in a bit-serial fashion. The multiplier computes the product and makes it available to the accumulator in a bit-serial fashion. The accumulator 58 sums the products in a pipelined fashion as the data becomes available from the multiplier. The accumulated sum of the products is temporarily stored in the tri-state data latch 60.
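The serial-parallel arithmetic of a single neuron can be modeled behaviorally as follows. This is a sketch of the arithmetic only, not of the gate-level circuit: it assumes the weight bits arrive least significant bit first and that two's complement is handled by giving the final (sign) bit a negative weight. The function names are hypothetical.

```python
def serial_parallel_multiply(activation, weight, bits=16):
    """Multiply a parallel activation by a weight fed one bit per cycle.

    `activation` is a signed integer presented in full; `weight` is a
    signed integer whose `bits`-wide two's complement form is consumed
    serially, least significant bit first (an assumption of this sketch).
    """
    w = weight & ((1 << bits) - 1)          # two's complement bit pattern
    product = 0
    for t in range(bits):
        bit = (w >> t) & 1
        partial = activation * bit          # partial product for this cycle
        if t == bits - 1:
            partial = -partial              # sign bit carries negative weight
        product += partial * (1 << t)
    return product

def neuron_mac(activations, weights, bits=16):
    """Accumulate the products of activation/weight pairs, as in a neuron."""
    acc = 0
    for a, w in zip(activations, weights):
        acc += serial_parallel_multiply(a, w, bits)
    return acc

total = neuron_mac([1, -2, 3], [4, 5, -6])  # 4 - 10 - 18
```

In hardware the accumulation overlaps with the multiplication, since each product bit is consumed as soon as it leaves the multiplier; the sketch computes the same result sequentially.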

Significant streamlining of the architecture, and hence better hardware utilization, is achieved by adopting a time multiplexed layers approach. Time multiplexing of layers refers to reusing the hardware used in calculating the activation of neurons in one layer for the calculation of neuron activations in another layer. Since neuro-computations are typically performed a layer at a time, multiplexing increases the utilization of hardware that would otherwise sit idle and leads to a direct reduction of the required silicon area. By reusing the circuitry dedicated to one layer during the evaluation of the next layer, only enough hardware to accommodate the layer with the largest number of neurons needs to be incorporated into the hardware chip. Other smaller layers can then reuse these portions of hardware during their execution.

The multiplexing of layers is shown diagrammatically in FIG. 6. The time multiplexing or sequential processing issues become clearer once the flow of information in the neural network upon initiation of a computation is understood. If input sensory data are presented to the neural network's four inputs at time t=0, the only active computations being performed in the network are strictly limited to those neurons receiving stimuli from the input layer neurons, i.e., neurons lying uniquely in the first hidden layer. All other neurons remain totally inactive. If the computation time of the neuron is defined by T, then at time t=T, all neurons in the first hidden layer will have computed their activations. Neurons in the first hidden layer can now play a passive role and simply broadcast their activations to neurons in the next layer, i.e., the second hidden layer, in a similar fashion. At this time, the only active neurons are those in the second hidden layer. The computation proceeds a layer of neurons at a time until the output neuron's activation is finally calculated. Thus, computations in a neural network are strictly performed a layer at a time, sequentially progressing through the hierarchy of layers that compose the network architecture. In the example of FIG. 6, the assigned neuron resources (5 of 16) for the hidden layer at time T0 are indicated at 62, and the assigned neuron resources (2 of 16) for the output layer at time T1 are indicated at 64. It will be understood that with respect to the misfire application shown in FIG. 4, there will be three multiplexing time periods T0-T2 for the two hidden layers 52 and 54 and the output layer.
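The allocation of the shared neuron pool to successive time periods can be tabulated with a short sketch. The function name and the dictionary layout are illustrative assumptions; only the pool size of sixteen and the 15-7-1 layer sizes come from the text.

```python
POOL_SIZE = 16  # physical neurons on the chip

def schedule_layers(layer_sizes, pool_size=POOL_SIZE):
    """Assign each non-input layer to one time period on the shared pool."""
    if max(layer_sizes) > pool_size:
        raise ValueError("largest layer exceeds the physical neuron pool")
    return [
        {"slot": "T%d" % t, "neurons_used": n, "neurons_idle": pool_size - n}
        for t, n in enumerate(layer_sizes)
    ]

# Misfire network of FIG. 4: two hidden layers and the output layer.
misfire_schedule = schedule_layers([15, 7, 1])
```

The 23 logical neurons of the misfire network are served by 16 physical ones over three periods, which is the saving that time multiplexing provides.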

As discussed previously, the basic element in neurocomputation is the neuron, which is a simple processing element. Neurons can be interconnected in various topologies by means of synaptic weights. Typically, neurons are organized into computational layers. Though arbitrarily complex network architectures can be constructed with these layers (architectures with multiple levels of hierarchy and interconnectivity), practical applications intimately link the neural network structure to its proposed functional use. The simplest form of nontrivial network is the multilayer perceptron shown in FIG. 4. The perceptron has an input layer of source nodes, any number of intermediate hidden layers, and a final layer of output neurons. The output signals of the neurons in the final layer of the network together constitute the overall response to the activation pattern supplied to the source nodes on the input layer.

Let the neurons be indexed by the subscript j. Then the total input, $x_j$, to neuron j is a linear function of the outputs, $y_i$, of all the neurons that are connected to neuron j and of the weights $w_{ij}$ on these connections, i.e.,

$$x_j = \sum_{i} y_i w_{ij} \qquad (1)$$

Neurons are usually provided with an additional stimulus in the form of a bias which has a value of 1, by introducing an extra input to each unit. The weights on this extra unit are called the bias weights and are equivalent to a threshold. Neurons have real-valued outputs, $y_j$, which are a nonlinear function of their inputs. The exact form of this equation can vary depending on the application at hand. The activation function used in this VLSI architecture, the bipolar sigmoid, is given in equation (2).

$$y_j = -1 + \frac{2}{1 + e^{-x_j}} \qquad (2)$$
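As a brief aside that is not part of the original text, the bipolar sigmoid can be rewritten over a common denominator, which shows that it is a scaled hyperbolic tangent:

$$y_j = -1 + \frac{2}{1 + e^{-x_j}} = \frac{1 - e^{-x_j}}{1 + e^{-x_j}} = \tanh\!\left(\frac{x_j}{2}\right)$$

It follows that $y_j = 0$ at $x_j = 0$, that $y_j \to \pm 1$ as $x_j \to \pm\infty$, and that the function is odd, $y(-x) = -y(x)$. The odd symmetry means that a look-up-table realization could in principle store only the non-negative half of the curve, although the specification does not say whether the ROM exploits this.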

Referring now to FIGS. 7 and 8, the architecture of the single chip stand-alone neuroprocessor 50 is shown. The chip was designed with the goal of minimizing the size of the neuroprocessor while maintaining the computational accuracy required for automotive diagnostic and control applications. The neuroprocessor architecture comprises a sixteen neuron module 70, a global controller 72, a sigmoid activation ROM look-up-table 74, neuron state RAM registers 76, and synaptic weight RAM registers 78. The controller 72 is shown in greater detail in FIG. 8. The sixteen neurons perform the neuronal multiply and accumulate operations. They receive as input the synaptic weights and activations from input nodes or from neurons on a previous layer in a bit serial-parallel fashion, and output the accumulated sum of partial products as given by equation (1). Because of the computational nature of neural networks, where information is sequentially computed a layer at a time, only enough neurons need be physically implemented in actual silicon as are required by the largest layer.

For specific recurrent neural network applications (e.g., misfire, idle speed control), a candidate pool of sixteen silicon neurons is sufficient. As previously stated, time multiplexing of layers permits the sixteen neurons to be re-utilized on successive layers. This time-multiplexing of layers radically streamlines the architecture by significantly increasing hardware utilization through reuse of available resources.

The global controller 72 enables the neurochip to execute its required task of generating necessary control logic as well as orchestrating data movement in the chip. When there are no computations being performed, the global controller remains in the idle state, signalling its availability by having the active low {overscore (BUSY)} flag set high. When a RUN command is issued, the global controller is in charge of providing control signals to the sixteen on-chip neurons, the RAM and the ROM in order to proceed with the desired neurocomputation. Input activations are read out of the 64×16 Neuron State RAM 76, synaptic weights are read out of the 2K×16 Synaptic Weight RAM 78, and both are propagated to the bank of 16 neurons 70. The global controller keeps track not only of intra-layer operations, but of inter-layer operations as well. Upon completion of a forward pass through the network architecture, the global controller sets the {overscore (BUSY)} flag high and returns to the idle state.

With reference to FIG. 8, the global controller is made up of a configuration controller 82 and a run-time controller 84. Configuration of the hardware is performed by the configuration controller and requires the loading of five 16-bit registers that together explicitly define the topology of the recurrent neural network architecture.

The configuration controller 82 accepts as input 16-bit data on bus D, a 3-bit address on bus A, a configuration control signal CFG, a clock C, and a global reset signal R. All signals feeding into the configuration controller are externally accessible. The 3-bit address bus internally selects one of five 16-bit configuration registers as the destination of the 16-bit data source D. By strobing the CFG control line, data can be synchronously loaded into any of the five architecture registers RA-RE. From an implementation perspective, the first four registers, registers RA-RD, uniquely define the topology of each layer in the neural network architecture. Thus, with this architecture there can be at most 4 layers in any recurrent neural network application, i.e., an input layer, an output layer, and two hidden layers. The 16-bit registers RA through RD each contain layer specific bit-fields (such as the number of neurons in the current layer and the number of recurrent connections within the layer) that collectively define the neural topology. Register RE defines the number of non-input layers in the neural network topology and, since the number of layers is restricted to 4, only the lowest 2 bits are of significance. Once the five configuration registers are loaded, a unique network topology is defined, and the global controller can proceed to the run-time mode.
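The register-loading interface can be modeled behaviorally as below. This is a hypothetical sketch: the mapping of addresses 0 through 4 to RA through RE is an assumption, and the layout of the layer-specific bit-fields inside RA-RD is not given in the text, so the registers are treated as opaque 16-bit words.

```python
REGISTER_NAMES = ("RA", "RB", "RC", "RD", "RE")

class ConfigurationRegisters:
    """Behavioral model of the five 16-bit architecture registers."""

    def __init__(self):
        # Modeled after a global reset: all registers cleared (assumption).
        self.regs = {name: 0 for name in REGISTER_NAMES}

    def load(self, address, data):
        """Model one CFG strobe: write 16-bit data to the addressed register."""
        if not 0 <= address < 8:
            raise ValueError("address bus is 3 bits wide")
        if address >= len(REGISTER_NAMES):
            raise ValueError("only five registers are addressable")
        self.regs[REGISTER_NAMES[address]] = data & 0xFFFF

    def non_input_layers(self):
        """Only the lowest two bits of RE are significant."""
        return self.regs["RE"] & 0b11

cfg = ConfigurationRegisters()
cfg.load(4, 0xFFF3)   # upper bits of RE are ignored when read back
```

For the misfire network, RE would hold 3 (two hidden layers plus the output layer), the largest value two bits can express.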

Once the configuration registers are loaded, control is passed to the run-time controller 84. At this stage, 2's complement binary coded data representing the engine sensor input quantities that need to be processed by the neural network are loaded into the neuron state RAM module 76 at appropriate memory locations. The module 84 remains in the idle mode for as long as the RUN line remains low. The low to high transition on the RUN line immediately resets the {overscore (BUSY)} flag and initiates the execution of a single forward pass of the control hierarchy using the registers RA through RE as a template that defines the network's topology. The {overscore (BUSY)} flag remains low until the neural network has completed the neurocomputation. It subsequently returns high after the contents of the output layer of the neural network have been placed back into appropriate memory locations of the neuron state RAM module 76. Once the {overscore (BUSY)} flag goes high, the contents of the neuron state RAM module are made available to the external world, and can be retrieved by the appropriate toggling of the RAM control lines. In this fashion, the output of the network can be read out and fresh inputs can be loaded into the hardware. The neuron state RAM module 76 is a single port RAM module, so once the neural network begins computations, the RAM module is inaccessible.

The run-time controller 84 is shown in greater detail in FIG. 9. It is made up of four distinct logic blocks. They are: a current layer register selector 90; a finite state machine 92 in charge of sequencing high-level inter-layer operations; an intra-layer propagation controller 94; and an intra-layer specific neuron output data storage controller 96.

When the RUN command is issued to the run-time controller 84, state machine 92 begins execution by clearing the {overscore (BUSY)} flag, the current layer register selector 90, the propagation controller 94, and the storage controller 96. The current layer register selector 90 has access to all four configuration registers, RA through RD. Upon reset, selector 90 points to the RA register (which defines the input layer topology) and thereby propagates its contents to the propagation and storage controllers, 94 and 96 respectively. The state machine 92 then passes control to the propagation controller 94 by toggling the RUN pin on controller 94 and goes into an idle mode. The role of the propagation controller 94 is to oversee the execution of the neuron multiply and accumulate operations. This is achieved by providing the necessary control logic and precise synchronization of data flow out of both the neuron RAM 76 and the synapse RAM 78 into the 16 element bank of bit-serial neurons. The propagation controller 94 therefore generates (1) the addresses ANRAM[5:0] and control signals CNRAM[2:0] to the neuron RAM 76; and (2) the addresses AWRAM[10:0] and control signals CWRAM to the synaptic weight RAM 78, eleven address bits being needed to span the 2K locations of that RAM. The propagation controller 94 also generates the control lines CN[3:0] to the neuron block 70, which include commands to clear the multipliers and accumulators. The OEBIAS signal allows the propagation of a bias term, indicated at 80 in FIG. 7, to the neurons 70. The bias term is propagated on the data bus to the neurons, in much the same way as the neuron inputs from the neuron storage RAM 76. When the bias term is invoked, the neuron RAM outputs are simply tri-stated.

Upon completion of the propagation controller task, the linear activations for all neurons in the current layer have been calculated, as given by equation (1). The state machine 92 then passes execution to the storage controller 96 by toggling its RUN pin. The responsibility of the storage controller is to calculate, as per equation (2), the non-linear activations for the neurons whose linear activations were just calculated, and subsequently to store the resulting quantities in RAM 76. This is achieved by sequentially enabling the linear output of each neuron on that layer, allowing the signal to propagate through the bipolar sigmoid activation look-up-table (LUT) 74, and storing the result in an appropriate memory location in RAM 76. Upon completion, the storage controller 96 returns control to the state machine 92. When active, the controller 96 generates the addresses ANRAM[5:0] and control signals CNRAM[2:0] to the neuron RAM 76, sequentially enables output of the active neurons via the OEN control lines, and enables access of the output from the LUT onto the neuron data input bus. When controller 96 completes execution, a full forward pass has been completed for a single layer of the recurrent neural network architecture. The state machine increments internal layer counters, and checks to see if there are any additional layers in the neural network topology that need to be calculated. If there are, the above process is repeated. If all layers have been computed and neuron outputs have been stored in RAM 76, the controller sets the {overscore (BUSY)} flag, and returns to the idle mode. When the {overscore (BUSY)} flag is high, data can be read from all RAM memory locations, and the results of the neurocomputation can be off-loaded to the external world. This completes the execution of the neurocomputation.
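The sequencing performed by the state machine, one propagation phase followed by one storage phase per layer, can be summarized in a behavioral sketch. For brevity it uses floating point, omits the recurrent connections, and computes the sigmoid directly instead of through a ROM table; the function name and the trace strings are illustrative assumptions.

```python
import math

def bipolar_sigmoid(x):
    return -1.0 + 2.0 / (1.0 + math.exp(-x))

def forward_pass(inputs, layers):
    """Run one forward pass and record the controller hand-offs.

    `layers` is a list of weight matrices; row j of a matrix holds
    [bias_weight, w_1, ..., w_n] for neuron j of that layer.
    """
    trace = ["BUSY low"]
    activations = list(inputs)
    for index, weights in enumerate(layers):
        # Propagation phase: multiply-accumulate, equation (1).
        sums = [row[0] + sum(w * a for w, a in zip(row[1:], activations))
                for row in weights]
        trace.append("layer %d propagated" % index)
        # Storage phase: non-linear activation, equation (2), written back
        # so that it becomes the input of the next layer.
        activations = [bipolar_sigmoid(s) for s in sums]
        trace.append("layer %d stored" % index)
    trace.append("BUSY high")
    return activations, trace

# Toy two-layer network with hypothetical weights.
out, trace = forward_pass([1.0, -1.0], [[[0.0, 1.0, 1.0]], [[0.0, 2.0]]])
```

The trace makes the ordering explicit: no layer begins propagation until the previous layer's activations have been stored.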

The multiplier 56 is shown in greater detail in FIG. 10. The multiplier 56 is used to perform the synaptic multiplications required by the neural network architecture. Application constraints called for a 16×16-bit multiplier. In operation, the multiplier accepts as input either (1) an input stimulus to the neural network or (2) the activation output from a neuron on a previous layer. It multiplies this quantity by the corresponding synaptic weights. The input stimulus (or activation output) is presented to the multiplier in a fully parallel fashion while the synaptic weights are presented in a bit-serial fashion. The serial output of the multiplier feeds into the accumulator.

Any size multiplier can be formed by cascading the basic multiplier cell. The bit-wise multiplication of the multiplier and multiplicand is performed by the AND gates 100a-100n. At each clock cycle, the bank of AND gates therefore computes the partial product terms of a multiplier Y and the current multiplicand X(t). Two's complement multiplication is achieved by using XOR gates 102a-102n connected with the outputs of the AND gates and providing inputs to full adders 104a-104n. By controlling one of the inputs on the XOR gate, the finite state machine 106 can form the two's complement of selected terms based on its control flow. In general, for an n×n multiplier resulting in a 2n bit product, the multiplier can be formed using 2n basic cells and will perform the multiplication in 2n+2 clock cycles. Successive operations can be pipelined and the latency of the LSB of the product is n+2 cycles. In this implementation, n=16.
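Substituting n=16 into the expressions above gives the figures for this implementation. The helper below is only a worked calculation; its name and dictionary keys are illustrative.

```python
def multiplier_timing(n):
    """Cell count and cycle counts for an n x n serial-parallel multiplier."""
    return {
        "cells": 2 * n,                   # basic cells cascaded
        "cycles_per_product": 2 * n + 2,  # clock cycles for a full product
        "lsb_latency": n + 2,             # cycles until the product LSB emerges
    }

timing = multiplier_timing(16)
```

For n=16 this gives 32 cells, 34 clock cycles per complete product, and a latency of 18 cycles before the least significant product bit appears.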

The accumulator 58 is shown in FIG. 11 and comprises a single bit-serial adder 110 linked to a chain of flip-flops generally indicated at 112. The bit-serial adder is made up of a single full adder and a flip-flop to store the carry bit. The length of the accumulator chain is controlled by the multiplication, which takes 2n+2 clock cycles to perform a complete multiplication. At each clock cycle, the accumulator sums the bit from the input data stream with both the contents of the last flip-flop on the chain as well as the carry bit, if any, generated from the last addition operation a clock cycle before. This value is subsequently stored into the first element of the chain. This creates a circulating chain of data bits in the accumulator. In operation, the adder's flip-flop is reset prior to the accumulation of a sum.

While the best mode for carrying out the present invention has been described in detail, those familiar with the art to which this invention relates will recognize various alternative designs and embodiments for practicing the invention as defined by the following claims.
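The circulating behavior of the accumulator can be simulated bit by bit. The sketch below assumes that input words arrive least significant bit first and that the carry flip-flop is cleared at each word boundary; it uses an 8-bit chain for readability, whereas the text ties the chain length to the 2n+2 cycle multiplication. The class and method names are hypothetical.

```python
class BitSerialAccumulator:
    """One full adder, a carry flip-flop, and a circulating flip-flop chain."""

    def __init__(self, length):
        self.length = length
        self.chain = [0] * length   # circulating chain of sum bits
        self.carry = 0              # carry flip-flop of the serial adder

    def clock(self, in_bit):
        """One clock cycle: add the input bit, the last chain bit, and carry."""
        total = in_bit + self.chain[-1] + self.carry
        self.carry = total >> 1
        # Shift the chain by one place and store the sum bit in the first element.
        self.chain = [total & 1] + self.chain[:-1]

    def add_word(self, value):
        """Feed one two's complement word, least significant bit first."""
        self.carry = 0              # adder flip-flop reset before the sum
        for t in range(self.length):
            self.clock((value >> t) & 1)

    def value(self):
        """Read the chain as a signed integer (chain[0] is the sign bit)."""
        raw = sum(bit << (self.length - 1 - i)
                  for i, bit in enumerate(self.chain))
        return raw - (1 << self.length) if self.chain[0] else raw

acc = BitSerialAccumulator(8)
for word in (5, -3, 10):
    acc.add_word(word)
```

After a whole word has been clocked in, the running sum has made one full trip around the chain, so its least significant bit is again at the adder input when the next word begins.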

What is claimed is:
 1. A system comprising: a neuroprocessor for implementing a reconfigurable network topology including a plurality of hidden layers containing neurons interconnected in a recurrent configuration; the neuroprocessor being responsive to an input pattern corresponding to processed real time engine operating conditions for determining whether a fault condition has occurred; the neuroprocessor comprising a neural module including a plurality of bit-serial neurons; a global controller for time multiplexing groups of neurons from the neural module to form first and second hidden layers of the network topology, the controller controlling application of the input pattern to the first hidden layer and controlling storage of the output of said first hidden layer for subsequent application as input to said second hidden layer.
 2. The system defined in claim 1 configured to detect engine misfire, said input pattern includes a crankshaft acceleration input, an engine load input, an engine speed input and a cylinder identification input, the neuroprocessor producing an output indicating whether a misfire has occurred.
 3. The system defined in claim 2 wherein each neuron comprises a bit-serial multiplier for multiplying first and second inputs, the controller sequentially applying to said first input of said multiplier one input of said input pattern or the activation output of a neuron on a previous layer, the controller applying a synaptic weight appropriate for said one input of said input pattern to said second input of said multiplier, each neuron further comprises a bit-serial accumulator for accumulating the output of said multiplier.
 4. The system defined in claim 3 wherein the input bits are provided to the multiplier in parallel and the input synaptic weight bits are provided to the multiplier serially, the multiplier computing the product of the two inputs and making the results available to the accumulator on a bit-serial basis.
 5. The system defined in claim 4 wherein the accumulator comprises a cyclical shift register with an adder at the input stage allowing the pipelining of outputs from the multiplier to be accumulated as the data is available from the multiplier.
 6. The system defined in claim 5 wherein upon completion of all multiply and accumulate operations, a tri-state data latch stores the relevant accumulated sum until required by the controller.
 7. The system defined in claim 6 wherein inputs are obtained from an engine controller.
 8. An engine misfire detection system comprising: a neuroprocessor for implementing a reconfigurable network topology including a plurality of hidden layers containing neurons interconnected in a recurrent configuration; the neuroprocessor being responsive to an input pattern for determining whether a misfire has occurred, said input pattern corresponding to processed real time engine operating conditions; the neuroprocessor comprising a neural module including a plurality of bit-serial neurons; a global controller for time multiplexing groups of neurons from the neural module to form first and second hidden layers of the network topology, the controller controlling application of the input pattern to the first hidden layer and controlling storage of the output of said first hidden layer for subsequent application as input to said second hidden layer; said global controller comprising a configuration controller, and a run-time controller, the configuration controller including a plurality of configuration registers containing data that explicitly defines the topology of each layer of the recurrent neural network architecture.
 9. The system defined in claim 8 wherein the configuration controller includes a data bus, an address bus, and receives a configuration control signal, a clock signal, and a reset signal, the address on the address bus internally selecting a configuration register as the destination of data on the data bus.
 10. The system defined in claim 9 wherein the neuroprocessor further includes a neuron state RAM module for storing the contents of the output layer of the neural network.
 11. The system defined in claim 10 wherein the neuron state RAM module is a single port RAM module.
 12. The system defined in claim 9 wherein the run-time controller comprises a current layer register selector, a finite state machine for sequencing high-level inter-layer operations, an intra-layer propagation controller for controlling execution of neuronal multiply and accumulates, and an intra-layer specific neuron output data storage controller for controlling calculation of non-linear activations for the neurons whose linear state is given in accordance with the following equation:

$x_{j} = \sum_{i} y_{i} w_{ij}$

and for subsequently storing the resulting quantities in the neuron state RAM.
 13. The system defined in claim 12 wherein the neuroprocessor further includes a sigmoid activation look-up-table for performing the non-linear activation function.
 14. The system defined in claim 13 wherein said input pattern includes a crankshaft acceleration input, an engine load input, an engine speed input, a cylinder identification input, and a bias input.
 15. The system defined in claim 14 wherein each neuron comprises a bit-serial multiplier for multiplying first and second inputs, the controller sequentially applying one input of said input pattern or the activation output of a neuron on a previous layer to said first input of said multiplier, the controller applying a synaptic weight appropriate for the input to said second input of said multiplier, each neuron further comprises a bit-serial accumulator for accumulating the output of said multiplier.
 16. The system defined in claim 15 wherein the input bits are provided to the multiplier in parallel and the input synaptic weight bits are provided to the multiplier serially, the multiplier computing the product of the two inputs and making the results available to the accumulator on a bit-serial basis.
 17. The system defined in claim 16 wherein the accumulator comprises a cyclical shift register with an adder at the input stage allowing the pipelining of outputs from the multiplier to be accumulated as the data is available from the multiplier.
 18. The system defined in claim 17 wherein upon completion of all multiply accumulates, a tri-state data latch stores the relevant accumulated sum until required by the controller.
 19. The system defined in claim 18 wherein certain ones of the inputs in said pattern are sensor data processed by an engine controller prior to input to said neuroprocessor.
 20. An engine misfire detection system comprising: a neuroprocessor for implementing a reconfigurable network topology including a plurality of hidden layers containing neurons interconnected in a recurrent configuration; said neuroprocessor being responsive to an input pattern for determining whether a misfire has occurred, said input pattern corresponding to processed real time engine operating conditions; said neuroprocessor comprising a neural module including a plurality of bit-serial neurons; a global controller for time multiplexing groups of neurons from the neural module to form first and second hidden layers of the network topology, the controller controlling application of the input pattern to the first hidden layer and controlling storage of the output of said first hidden layer for subsequent application as input to said second hidden layer; said global controller comprising a configuration controller, and a run-time controller, the configuration controller including a plurality of configuration registers containing data that explicitly defines the topology of each layer of the recurrent neural network architecture; said global controller including a run-time controller for initiating a neurocomputation using the data in said configuration registers, and a neuron state RAM module for storing the contents of the output layer of the neural network; said run-time controller comprising a current layer register selector, a finite state machine for sequencing high-level inter-layer operations, an intra-layer propagation controller for controlling execution of neuronal multiply and accumulates, a sigmoid activation look-up-table for performing a non-linear activation function and an intra-layer specific neuron output data storage controller for controlling calculation of non-linear activations for the neurons and for subsequently storing the resulting quantities in said neuron state RAM, said input pattern including a crankshaft acceleration input, an engine load input, an engine speed input, a cylinder identification input, and a bias input.